Abstract

Heuristics are crucial tools in decreasing search ef- 
fort in varied fields of AI. In order to be effective, 
a heuristic must be efficient to compute, as well as 
provide useful information to the search algorithm. 
However, some well-known heuristics which do 
well in reducing backtracking are so heavy that the 
gain of deploying them in a search algorithm might 
be outweighed by their overhead. 

We propose a rational metareasoning approach to 
decide when to deploy heuristics, using CSP back- 
tracking search as a case study. In particular, a 
value of information approach is taken to adaptive 
deployment of solution-count estimation heuristics 
for value ordering. Empirical results show that 
indeed the proposed mechanism successfully bal- 
ances the tradeoff between decreasing backtracking 
and heuristic computational overhead, resulting in 
a significant overall search time reduction. 



1 Introduction 

Large search spaces are common in artificial intelligence, 
heuristics being of major importance in limiting search ef- 
forts. The role of a heuristic, depending on type of search 
algorithm, is to decrease the number of nodes expanded (e.g. 
in A* search), the number of candidate actions considered 
(planning), or the number of backtracks in constraint satisfaction
problem (CSP) solvers. Nevertheless, some sophisticated
heuristics have considerable computational overhead, significantly
decreasing their overall effect [Horsch and Havens, 2000;
Kask et al., 2004], even causing increased total runtime
in pathological cases. It has been recognized that control of
this overhead can be essential to improve search performance,
e.g. by selecting which heuristics to evaluate in a manner
dependent on the state of the search [Wallace and Freuder, 1992;
Domshlak et al., 2010].



We propose a rational metareasoning approach [Russell and
Wefald, 1991] to decide when and how to deploy
heuristics, using CSP backtracking search as a case study.
The heuristics examined are various solution count estimate
heuristics for value ordering [Meisels et al., 1997; Horsch and
Havens, 2000], which are expensive to compute, but can
significantly decrease the number of backtracks. These heuristics
make a good case study, as their overall utility, taking
computational overhead into account, is sometimes detrimental;
and yet, by employing these heuristics adaptively, it may
still be possible to achieve an overall runtime improvement,
even in these pathological cases. Following the metareasoning
approach, the value of information (VOI) of a heuristic is
defined in terms of total search time saved, and the heuristic
is computed such that the expected net VOI is positive.

We begin with background on metareasoning and CSP
(Section 2), followed by a re-statement of value ordering in
terms of rational metareasoning (Section 3), allowing a
definition of VOI of a value-ordering heuristic, a contribution
of this paper. This scheme is then instantiated to handle
our case study of backtracking search in CSP (Section 4),
with parameters specific to value-ordering heuristics based on
solution-count estimates, the main contribution of this paper.
Empirical results (Section 5) show that the proposed mechanism
successfully balances the tradeoff between decreasing
backtracking and heuristic computational overhead, resulting
in a significant overall search time reduction. Other aspects of
such tradeoffs are also analyzed empirically. Finally, possible
future extensions of the proposed mechanism are discussed
(Section 6), as well as an examination of related work.

2 Background

2.1 Rational metareasoning

In rational metareasoning [Russell and Wefald, 1991], a
problem-solving agent can perform base-level actions from
a known set $\{A_i\}$. Before committing to an action, the agent
may perform a sequence of meta-level "deliberation" actions
from a set $\{S_j\}$. At any given time there is an "optimal"
base-level action, $A_\alpha$, that maximizes the agent's expected utility:

$$A_\alpha = \arg\max_{A_i} \sum_k P(W_k)\, U(A_i, W_k) \qquad (1)$$

where $\{W_k\}$ is the set of possible world states, $U(A_i, W_k)$ is
the utility of performing action $A_i$ in state $W_k$, and $P(W_k)$ is
the probability that the current world state is $W_k$.

A meta-level action provides information and affects the
choice of the base-level action $A_\alpha$. The value of information
(VOI) of a meta-level action $S_j$ is the expected difference
between the expected utility of $S_j$ and the expected utility
of the current $A_\alpha$, where $P$ is the current belief distribution
about the state of the world, and $P^j$ is the belief-state distribution
of the agent after the computational action $S_j$ is performed,
given the outcome of $S_j$:

$$V(S_j) = E_P\big(E_{P^j}(U(S_j)) - E_{P^j}(U(A_\alpha))\big) \qquad (2)$$

Under certain assumptions, it is possible to capture the
dependence of utility on time in a separate notion of time cost
$C$. Then, Equation (2) can be rewritten as:

$$V(S_j) = \Lambda(S_j) - C(S_j) \qquad (3)$$

where the intrinsic value of information

$$\Lambda(S_j) = E_P\big(E_{P^j}(U(A_\alpha^j)) - E_{P^j}(U(A_\alpha))\big) \qquad (4)$$

is the expected difference between the intrinsic expected
utilities of the new ($A_\alpha^j$) and the old ($A_\alpha$) selected
base-level action, computed after the meta-level action is taken.
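Equations (1), (3) and (4) can be illustrated on a toy problem. The sketch below is only an illustration under stated assumptions: the two actions, two world states, utilities, prior, and time cost are hypothetical numbers, and the deliberation action is assumed to reveal the world state exactly. The function names are ours, not part of any library.

```python
def expected_utility(action, belief, utility):
    """Expected utility of a base-level action under a belief over world states."""
    return sum(prob * utility[action][world] for world, prob in belief.items())


def best_action(belief, utility):
    """The action A_alpha of Equation (1) for the given belief."""
    return max(utility, key=lambda action: expected_utility(action, belief, utility))


def intrinsic_voi(prior, outcomes, utility):
    """Equation (4): expected gain from re-selecting the action after deliberating.

    outcomes is a list of (probability, posterior belief) pairs, one per
    possible result of the meta-level action.
    """
    old = best_action(prior, utility)
    gain = 0.0
    for prob, posterior in outcomes:
        new = best_action(posterior, utility)
        gain += prob * (expected_utility(new, posterior, utility)
                        - expected_utility(old, posterior, utility))
    return gain


# Hypothetical toy numbers, not taken from the paper.
utility = {"A1": {"W1": 10.0, "W2": 0.0}, "A2": {"W1": 4.0, "W2": 6.0}}
prior = {"W1": 0.6, "W2": 0.4}
# Assumption: the deliberation action reveals the world state exactly.
outcomes = [(0.6, {"W1": 1.0, "W2": 0.0}), (0.4, {"W1": 0.0, "W2": 1.0})]
time_cost = 1.0

value = intrinsic_voi(prior, outcomes, utility)  # Lambda(S_j), Equation (4)
net = value - time_cost                          # V(S_j), Equation (3)
print(best_action(prior, utility), round(value, 3), round(net, 3))
```

Under the prior, A1 is selected; deliberation changes the choice only when W2 is revealed, so the intrinsic VOI is 0.4 · (6 − 0) = 2.4, and deliberating pays off as long as its time cost stays below that value.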

2.2 Constraint satisfaction 

A constraint satisfaction problem (CSP) is defined by a set
of variables $X = \{X_1, X_2, \ldots\}$, and a set of constraints
$C = \{C_1, C_2, \ldots\}$. Each variable $X_i$ has a non-empty domain
$D_i$ of possible values. Each constraint $C_i$ involves some subset
of the variables (the scope of the constraint) and specifies
the allowable combinations of values for that subset. An
assignment that does not violate any constraints is called
consistent (or a solution). There are numerous variants of CSP
settings and algorithmic paradigms. This paper focuses on
binary CSPs over discrete-valued variables, and backtracking
search algorithms [Tsang, 1993].

A basic method used in numerous CSP search algorithms
is that of maintaining arc consistency (MAC) [Sabin and
Freuder, 1997]. There are several versions of MAC; all share
the common notion of arc consistency. A variable $X_i$ is
arc-consistent with $X_j$ if for every value $a$ of $X_i$ from the domain
$D_i$ there is a value $b$ of $X_j$ from the domain $D_j$ satisfying the
constraint between $X_i$ and $X_j$. MAC maintains arc consistency
for all pairs of variables, and speeds up backtracking
search by pruning many inconsistent branches.
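The notion of arc consistency defined above can be sketched with the textbook AC-3 procedure. This is a generic illustration, not the modified MAC implementation used in the experiments; the data layout (a dictionary of domains and a dictionary of allowed value pairs per ordered variable pair) and the helper names are assumptions made for the example.

```python
from collections import deque


def add_constraint(allowed, xi, xj, pairs):
    """Store a binary constraint in both directions as sets of allowed pairs."""
    allowed[(xi, xj)] = set(pairs)
    allowed[(xj, xi)] = {(b, a) for a, b in pairs}


def revise(domains, allowed, xi, xj):
    """Make xi arc-consistent with xj; return True if the domain of xi shrank."""
    supported = {a for a in domains[xi]
                 if any((a, b) in allowed[(xi, xj)] for b in domains[xj])}
    changed = supported != domains[xi]
    domains[xi] = supported
    return changed


def ac3(domains, allowed):
    """Enforce arc consistency on all arcs; return False on a domain wipe-out."""
    queue = deque(allowed)  # all ordered pairs (xi, xj) that share a constraint
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, allowed, xi, xj):
            if not domains[xi]:
                return False
            # Arcs pointing at xi must be re-checked, except the one from xj.
            for xk, xl in allowed:
                if xl == xi and xk != xj:
                    queue.append((xk, xi))
    return True


# Toy instance: X < Y < Z over the domain {1, 2, 3}.
domains = {v: {1, 2, 3} for v in "XYZ"}
allowed = {}
less = [(a, b) for a in range(1, 4) for b in range(1, 4) if a < b]
add_constraint(allowed, "X", "Y", less)
add_constraint(allowed, "Y", "Z", less)
consistent = ac3(domains, allowed)
print(consistent, domains)
```

On this toy instance propagation alone reduces every domain to a single value, which is the kind of pruning that MAC performs after each assignment during backtracking search.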

CSP backtracking search algorithms typically employ both
variable ordering [Tsang, 1993] and value ordering heuristics.
The latter type include minimum conflicts [Tsang, 1993],
which orders values by the number of conflicts they cause
with unassigned variables, Geelen's promise [Geelen, 1992],
which orders values by the product of domain sizes, and
minimum impact [Refalo, 2004], which orders values by relative
impact of the value assignment on the product of the domain sizes.

Some value-ordering heuristics are based on solution count
estimates [Meisels et al., 1997; Horsch and Havens, 2000;
Kask et al., 2004]: solution counts for each value assignment
of the current variable are estimated, and assignments
(branches) with the greatest solution count are searched first.
The heuristics are based on the assumption that the estimates
are correlated with the true number of solutions, and thus a
greater solution count estimate means a higher probability
that a solution will be found in a branch, as well as a shorter search
time to find the first solution if one exists in that branch.
[Meisels et al., 1997] estimate solution counts by approximating
marginal probabilities in a Bayesian network derived
from the constraint graph; [Horsch and Havens, 2000] propose
the probabilistic arc consistency heuristic (pAC) based
on iterative belief propagation for a better accuracy of relative
solution count estimates; [Kask et al., 2004] adapt Iterative
Join-Graph Propagation to solution counting, allowing
a tradeoff between accuracy and complexity. These methods
vary by computation time and precision, although all are
rather computationally heavy. Principles of rational metareasoning
can be applied independently of the choice of implementation,
to decide when to deploy these heuristics.

3 Rational Value-Ordering 

The role of (dynamic) value-ordering is to determine the order
of values to assign to a variable $X_k$ from its domain $D_k$,
at a search state where values have already been assigned
to $(X_1, \ldots, X_{k-1})$. We make the standard assumption that
the ordering may depend on the search state, but is not
re-computed as a result of backtracking from the initial value
assignments to $X_k$: a new ordering is considered only after
backtracking up the search tree above $X_k$.

Value ordering heuristics provide information on future
search efforts, which can be summarized by 2 parameters:

• $T_i$: the expected time to find a solution containing
assignment $X_k = y_{ki}$ or verify that there are no such solutions;

• $p_i$: the "backtracking probability", that there will be no
solution consistent with $X_k = y_{ki}$.

These are treated as the algorithm's subjective probabilities
about future search in the current problem instance, rather
than actual distributions over problem instances. Assuming
correct values of these parameters, and independence of
backtracks, the expected remaining search time in the subtree
under $X_k$ for ordering $\omega$ is given by:

$$T^{s|\omega} = T_{\omega(1)} + \sum_{i=2}^{|D_k|} T_{\omega(i)} \prod_{j=1}^{i-1} p_{\omega(j)} \qquad (5)$$

In terms of rational metareasoning, the "current" optimal
base-level action is picking the $\omega$ which optimizes $T^{s|\omega}$.
Based on a general property of functions on sequences
[Monma and Sidney, 1979], it can be shown that $T^{s|\omega}$ is
minimal if the values are sorted by increasing order of $\frac{T_i}{1-p_i}$.
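The ordering rule can be checked numerically on a small example. The sketch below evaluates Equation (5) for every permutation of three values and compares the best one with the ordering obtained by sorting on $T_i/(1-p_i)$; the values of $T_i$ and $p_i$ are hypothetical, and all $p_i$ are assumed to be strictly below 1.

```python
from itertools import permutations


def expected_search_time(order, T, p):
    """Equation (5): value i is tried only if all earlier values backtracked."""
    total, prob_reached = 0.0, 1.0
    for i in order:
        total += prob_reached * T[i]
        prob_reached *= p[i]
    return total


def rational_order(T, p):
    """Sort value indices by increasing T_i / (1 - p_i); assumes every p_i < 1."""
    return sorted(range(len(T)), key=lambda i: T[i] / (1.0 - p[i]))


# Hypothetical parameters for a three-value domain.
T = [4.0, 1.0, 3.0]
p = [0.2, 0.9, 0.5]

order = rational_order(T, p)
best = min(permutations(range(len(T))),
           key=lambda perm: expected_search_time(perm, T, p))
print(order, expected_search_time(order, T, p), list(best))
```

Note that the cheapest value (index 1) is tried last here: it is very likely to backtrack, so its ratio $T_i/(1-p_i)$ is the largest of the three.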

A candidate heuristic $H$ (with computation time $T^H$) generates
an ordering by providing an updated (hopefully more
precise) value of the parameters $T_i, p_i$ for value assignments
$X_k = y_{ki}$, which may lead to a new optimal ordering $\omega_H$,
corresponding to a new base-level action. The total expected
remaining search time is given by:

$$T = T^H + E[T^{s|\omega_H}] \qquad (6)$$

Since both $T^H$ (the "time cost" of $H$ in metareasoning
terms) and $T^{s|\omega_H}$ contribute to $T$, even a heuristic that
improves the estimates and ordering may not be useful. It may
be better not to deploy $H$ at all, or to update $T_i, p_i$ only for
some of the assignments. According to the rational metareasoning
approach (Section 2.1), the intrinsic VOI $\Lambda_i$ of estimating
$T_i, p_i$ for the $i$th assignment is the expected decrease
in the expected search time:

$$\Lambda_i = E\left[T^{s|\omega_-} - T^{s|\omega_{+i}}\right] \qquad (7)$$

where $\omega_-$ is the optimal ordering based on priors, and $\omega_{+i}$ on
values after updating $T_i, p_i$. Computing new estimates (with
overhead $T^c$) for values $T_i, p_i$ is beneficial just when the net
VOI is positive:

$$V_i = \Lambda_i - T^c \qquad (8)$$

To simplify estimation of $\Lambda_i$, the expected search time of an
ordering is estimated as though the parameters are computed
only for $\omega_-(1)$ (essentially the metareasoning subtree
independence assumption). Other value assignments are assumed
to have the prior ("default") parameters $T_{def}, p_{def}$. Assume
w.l.o.g. that $\omega_-(1) = 1$:

$$T^{s|\omega_-} = T_1 + p_1 \sum_{i=2}^{|D_k|} T_{def}\, p_{def}^{(i-2)} = T_1 + p_1 T_{def} \frac{1 - p_{def}^{(|D_k|-1)}}{1 - p_{def}} \qquad (9)$$

and the intrinsic VOI of the $i$th deliberation action is:

$$\Lambda_i = E\left[ G(T_i, p_i) \,\middle|\, \frac{T_i}{1-p_i} < \frac{T_1}{1-p_1} \right] \qquad (10)$$

where $G(T_i, p_i)$ is the search time gain given the heuristically
computed values $T_i, p_i$:

$$G(T_i, p_i) = T_1 - T_i + (p_1 - p_i)\, T_{def} \frac{1 - p_{def}^{(|D_k|-1)}}{1 - p_{def}} \qquad (11)$$

In some cases, $H$ provides estimates only for the expected
search time $T_i$. In such cases, the backtracking probability $p_i$
can be bounded by the Markov inequality as the probability
for the given assignment that the time $t$ to find a solution or
verify that no solution exists is at least the time $T^{all}$ to find
all solutions: $p_i = P(t \ge T^{all}) \le \frac{T_i}{T^{all}}$, and the bound can
be used as the probability estimate:

$$p_i \approx \frac{T_i}{T^{all}} \qquad (12)$$

Furthermore, note that in harder problems the probability
of backtracking from variable $X_k$ is proportional to $p_{def}^{(|D_k|-1)}$,
and it is reasonable to assume that backtracking probabilities
above $X_k$ (trying values for $X_1, \ldots, X_{k-1}$) are still
significantly greater than 0. Thus, the "default" backtracking
probability $p_{def}$ is close to 1, and consequently:

$$T^{all} \approx T_{def}, \qquad \frac{1 - p_{def}^{(|D_k|-1)}}{1 - p_{def}} \approx |D_k| - 1 \qquad (13)$$

By substituting (12), (13) into (11), estimate (14) for
$G(T_i, p_i)$ is obtained:

$$G(T_i, p_i) \approx T_1 - T_i + \frac{T_1 - T_i}{T^{all}}\, T_{def} \frac{1 - p_{def}^{(|D_k|-1)}}{1 - p_{def}} \approx (T_1 - T_i)\,|D_k| \qquad (14)$$

Finally, since (12), (13) imply that $T_i < T_1 \Leftrightarrow \frac{T_i}{1-p_i} < \frac{T_1}{1-p_1}$,

$$\Lambda_i \approx E\left[ (T_1 - T_i)\,|D_k| \,\middle|\, T_i < T_1 \right] \qquad (15)$$

4 VOI of Solution Count Estimates



The estimated solution count for an assignment may be used
to estimate the expected time to find a solution for the
assignment under the following assumptions¹:

1. Solutions are roughly evenly distributed in the search
space, that is, the distribution of time to find a solution
can be modeled by a Poisson process.

2. Finding all solutions for an assignment $X_k = y_{ki}$ takes
roughly the same time for all assignments to the variable
$X_k$. Prior work [Meisels et al., 1997; Kask et al., 2004]
demonstrates that ignoring the differences in subproblem
sizes is justified.

3. The expected time to find all solutions for an assignment
divided by its solution count estimate is a reasonable
estimate for the expected time to find a single solution.

Based on these assumptions, $T_i$ can be estimated as $\frac{T^{all}}{|D_k|\, n_i}$,
where $T^{all}$ is the expected time to find all solutions for all
values of $X_k$, and $n_i$ is the solution count estimate for $y_{ki}$;
likewise, $T_1 = \frac{T^{all}}{|D_k|\, n_{max}}$, where $n_{max}$ is the currently greatest
$n_i$. By substituting the expressions for $T_i$, $T_1$ into (15), obtain
as the intrinsic VOI of computing $n_i$:

$$\Lambda_i = T^{all} \sum_{n=n_{max}}^{\infty} \left(\frac{1}{n_{max}} - \frac{1}{n}\right) P(n, \nu) \qquad (16)$$

where $P(n, \nu) = e^{-\nu}\frac{\nu^n}{n!}$ is the probability, according to the
Poisson distribution, to find $n$ solutions for a particular
assignment when the mean number of solutions per assignment
is $\nu = \frac{N}{|D_k|}$, and $N$ is the estimated solution count for all
values of $X_k$, computed at an earlier stage of the algorithm.

Neither $T^{all}$ nor $T^c$, the time to estimate the solution count
for an assignment, is known. However, for relatively low
solution counts, when an invocation of the heuristic has high
intrinsic VOI, both $T^{all}$ and $T^c$ are mostly determined by
the time spent eliminating non-solutions. Therefore, $T^c$ can
be assumed approximately proportional to $\frac{T^{all}}{|D_k|}$, the average
time to find all solutions for a single assignment, with an
unknown factor $\gamma < 1$:

$$T^c \approx \gamma \frac{T^{all}}{|D_k|} \qquad (17)$$

Then, $T^{all}$ can be eliminated from both $T^c$ and $\Lambda$. Following
Equation (8), the solution count should be estimated whenever
the net VOI is positive:

$$V(n_{max}) \propto |D_k|\, e^{-\nu} \sum_{n=n_{max}}^{\infty} \left(\frac{1}{n_{max}} - \frac{1}{n}\right) \frac{\nu^n}{n!} - \gamma \qquad (18)$$

The omitted proportionality factor $\frac{T^{all}}{|D_k|}$ is positive, so the
sign of the net VOI is preserved. The infinite series in (18)
rapidly converges, and an approximation of the sum can be
computed efficiently. As done in Section 5, $\gamma$ can be learned
offline from a set of problem instances of a certain kind for
the given implementation of the search algorithm and the
solution counting heuristic.

¹ We do not claim that this is a valid model of CSP search; rather,
we argue that even with such a crude model one can get significant
runtime improvements.
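The sign test of Equation (18) can be sketched as a truncated series. The function below is an illustration only: the name `net_voi`, the truncation tolerance `tol`, starting the sum at the first integer above `n_max` (the term at $n = n_{max}$ is zero), and the treatment of degenerate inputs are all assumptions of this sketch, not details of the authors' implementation.

```python
import math


def net_voi(n_max, nu, domain_size, gamma, tol=1e-12):
    """Right-hand side of Equation (18); only its sign matters.

    n_max: currently greatest solution count estimate
    nu: mean number of solutions per assignment, N / |D_k|
    domain_size: |D_k|
    gamma: relative cost of one solution count estimation
    """
    if n_max <= 0 or nu <= 0:
        return -gamma  # assumption of this sketch: no gain in degenerate cases
    total = 0.0
    n = math.floor(n_max) + 1  # terms with n <= n_max contribute no gain
    while True:
        # Poisson probability computed in log space to avoid overflow.
        log_pmf = -nu + n * math.log(nu) - math.lgamma(n + 1)
        term = (1.0 / n_max - 1.0 / n) * math.exp(log_pmf)
        total += term
        if n > nu and term < tol:  # past the mode the terms only shrink
            break
        n += 1
    return domain_size * total - gamma


# Hypothetical setting: |D_k| = 10, nu = 2, gamma = 1e-3.
for n_max in (2, 4, 20):
    print(n_max, net_voi(n_max, 2.0, 10, 1e-3) > 0)
```

As `n_max` grows relative to `nu`, finding an even better value becomes less likely, the series shrinks, and the test eventually recommends against further estimation.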

Algorithm 1 implements rational value ordering. The procedure
receives problem instance csp with assigned values for
variables $X_1, \ldots, X_{k-1}$, variable $X_k$ to be ordered, and
estimate $N$ of the number of solutions of the problem instance
(line 1); $N$ is computed at the previous step of the backtracking
algorithm as the solution count estimate for the chosen
assignment for $X_{k-1}$, or, if $k = 1$, at the beginning of the
search as the total solution count estimate for the instance.
Solution count estimates $n_i$ for some of the assignments are
re-computed (lines 4–10), and then the domain of $X_k$, ordered
by non-increasing solution count estimates of value assignments,
is returned (lines 11–12).



Algorithm 1 Value Ordering via Solution Count Estimation
 1: procedure ValueOrdering-SC(csp, X_k, N)
 2:     D ← D_k, n_max ← N/|D|
 3:     for all i in 1..|D| do n_i ← n_max
 4:     while V(n_max) > 0 do
 5:         choose y_ki ∈ D arbitrarily
 6:         D ← D \ {y_ki}
 7:         csp' ← csp with D_k = {y_ki}
 8:         n_i ← EstimateSolutionCount(csp')
 9:         if n_i > n_max then n_max ← n_i
10:     end while
11:     D_ord ← sort D_k by non-increasing n_i
12:     return D_ord
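A minimal Python sketch of Algorithm 1 follows. It is not the authors' implementation: the solution count estimator is passed in as a caller-supplied function `estimate` (a hypothetical stand-in for EstimateSolutionCount applied to the restricted problem), the series of Equation (18) is truncated with an assumed tolerance, and the loop additionally stops when no unestimated values remain, a guard the pseudocode leaves implicit.

```python
import math


def net_voi(n_max, nu, domain_size, gamma, tol=1e-12):
    """Equation (18) up to a positive factor; truncation is an assumption."""
    if n_max <= 0 or nu <= 0:
        return -gamma
    total, n = 0.0, math.floor(n_max) + 1
    while True:
        pmf = math.exp(-nu + n * math.log(nu) - math.lgamma(n + 1))
        term = (1.0 / n_max - 1.0 / n) * pmf
        total += term
        if n > nu and term < tol:
            return domain_size * total - gamma
        n += 1


def value_ordering_sc(domain, total_count, estimate, gamma=1e-3):
    """Order the values of X_k, estimating counts only while the net VOI is positive.

    domain: values of X_k; total_count: N; estimate: hypothetical callable
    returning the solution count estimate for a single value.
    """
    remaining = list(domain)
    nu = total_count / len(remaining)       # mean solutions per assignment
    n_max = nu                              # line 2 of Algorithm 1
    counts = {y: n_max for y in remaining}  # line 3: default estimates
    while remaining and net_voi(n_max, nu, len(domain), gamma) > 0:
        y = remaining.pop()                 # "choose arbitrarily"
        counts[y] = estimate(y)
        n_max = max(n_max, counts[y])
    return sorted(domain, key=lambda y: counts[y], reverse=True)


# Hypothetical usage with a lookup table standing in for a real estimator.
true_counts = {"a": 1, "b": 9, "c": 2, "d": 4}
calls = []


def fake_estimate(y):
    calls.append(y)
    return true_counts[y]


ordering = value_ordering_sc(list(true_counts), 16, fake_estimate)
print(ordering, calls)
```

In this toy run the estimator is invoked for only some of the values: once a value with a count well above the mean is found, the net VOI of further estimation becomes negative and the remaining values keep their default estimate.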



5 Empirical Evaluation

Specifying the algorithm parameter $\gamma$ is the first issue. $\gamma$
should be a characteristic of the implementation of the search
algorithm, rather than of the problem instance; it is also
desirable that the performance of the algorithm not be too sensitive
to fine tuning of this parameter.

Most of the experiments were conducted on sets of random
problem instances generated according to Model RB [Xu and
Li, 2000]. The empirical evaluation was performed in two
stages. In the first stage, several benchmarks were solved
for a wide range of values of $\gamma$, and an appropriate value for
$\gamma$ was chosen. In the second stage, the search was run on
two sets of problem instances with the chosen $\gamma$, as well as
with exhaustive deployment, and with the minimum conflicts
heuristic, and the search time distributions were compared for
each of the value ordering heuristics.

The AC-3 version of MAC was used for the experiments,
with some modifications [Sabin and Freuder, 1997]. Variables
were ordered using the maximum degree variable ordering
heuristic². The solution counting heuristic was based
on the solution count estimate proposed in [Meisels et al.,
1997]. The source code is available from
http://ftp.davidashen.net/vsc.tar.gz

² A dynamic variable ordering heuristic, such as dom/deg, may
result in shorter search times in general, but gave no significant
improvement in our experiments; on the other hand, static variable
ordering simplifies the analysis.

5.1 Benchmarks

Figure 1: Influence of $\gamma$ in CSP benchmarks (a. Search time;
b. Number of backtracks; c. Solution count estimations)

CSP benchmarks from CSP Solver Competition 2005
[Boussemart et al., 2005] were used. 14 out of 26 benchmarks
solved by at least one of the solvers submitted for the
competition could be solved with a 30 minutes timeout by the
solver used for this empirical study for all values of $\gamma$: $\gamma = 0$
and the exponential range $\gamma \in \{10^{-7}, 10^{-6}, \ldots, 1\}$, as well as
with the minimum-conflicts heuristic and the pAC heuristic.

Figure 1a shows the mean search time of VOI-driven solution
count estimate deployment $T_{VSC}$ normalized by the
search time of exhaustive deployment $T_{SC}$ ($\gamma = 0$), for the
minimum conflicts heuristic $T_{MC}$, and for the pAC heuristic
$T_{pAC}$. The shortest search time on average is achieved by
VSC for $\gamma \in [10^{-4}, 3 \cdot 10^{-3}]$ (shaded in the figure) and is
much shorter than for SC (mean$\left(\frac{T_{VSC}(10^{-3})}{T_{SC}}\right) \approx 0.45$); the
improvement is actually close to getting all the information
provided by the heuristic without paying the overhead at all.
For all but one of the 14 benchmarks the search time for VSC
with $\gamma = 3 \cdot 10^{-3}$ is shorter than for MC. For most values
of $\gamma$, VSC gives better results than MC ($\frac{T_{VSC}}{T_{MC}} < 1$). pAC
always results in the longest search time due to the computational
overhead.

Figure 1b shows the mean number of backtracks of VOI-driven
deployment $N_{VSC}$ normalized by the number of backtracks
of exhaustive deployment $N_{SC}$, the minimum conflicts
heuristic $N_{MC}$, and for the pAC heuristic $N_{pAC}$. VSC causes
less backtracking than MC for $\gamma < 3 \cdot 10^{-3}$ ($\frac{N_{VSC}}{N_{MC}} < 1$). pAC
always causes less backtracking than other heuristics, but has
overwhelming computational overhead.

Figure 1c shows $C_{VSC}$, the number of estimated solution
counts of VOI-driven deployment, normalized by the
number of estimated solution counts of exhaustive deployment
$C_{SC}$. When $\gamma = 10^{-3}$ and the best search time is
achieved, the solution counts are estimated only in a relatively
small number of search states: the average number of
estimations is ten times smaller than in the exhaustive case
(mean$\left(\frac{C_{VSC}(10^{-3})}{C_{SC}}\right) \approx 0.099$, median$\left(\frac{C_{VSC}(10^{-3})}{C_{SC}}\right) \approx 0.048$).

The results show that although the solution counting 
heuristic may provide significant improvement in the search 
time, further improvement is achieved when the solution 
count is estimated only in a small fraction of occasions se- 
lected using rational metareasoning. 

Figure 2: Search time comparison on sets of random instances
(a. Easier instances)



5.2 Random instances 

Based on the results on benchmarks, we chose $\gamma = 10^{-3}$,
and applied it to two sets of 100 problem instances. Exhaustive
deployment, rational deployment, the minimum conflicts
heuristic, and probabilistic arc consistency were compared.

The first, easier, set was generated with 30 variables, 30
values per domain, 280 constraints, and 220 nogoods per
constraint. Search time distributions are presented in Figure 2a.
The shortest mean search time is achieved for rational
deployment, with exhaustive deployment next ($\frac{\overline{T}_{SC}}{\overline{T}_{VSC}} \approx 1.75$),
followed by the minimum conflicts heuristic ($\frac{\overline{T}_{MC}}{\overline{T}_{VSC}} \approx 2.16$)
and probabilistic arc consistency ($\frac{\overline{T}_{pAC}}{\overline{T}_{VSC}} \approx 3.42$). Additionally,
while the search time distributions for solution counting
are sharp ($\frac{\max T_{SC}}{\overline{T}_{SC}} \approx 1.08$, $\frac{\max T_{VSC}}{\overline{T}_{VSC}} \approx 1.73$), the distribution
for the minimum conflicts heuristic has a long tail with a
much longer worst case time ($\frac{\max T_{MC}}{\overline{T}_{MC}} \approx 5.67$).

The second, harder, set was generated with 40 variables,
19 values, 410 constraints, 90 nogood pairs per constraint.
Search time distributions are presented in Figure 2b. As
with the first set, the shortest mean search time is achieved
for rational deployment: $\frac{\overline{T}_{SC}}{\overline{T}_{VSC}} \approx 1.43$, while the relative
mean search time for the minimum conflicts heuristic is much
longer: $\frac{\overline{T}_{MC}}{\overline{T}_{VSC}} \approx 3.45$. The probabilistic arc consistency
heuristic resulted again in the longest search time due to the
overhead of computing relative solution count estimates by
loopy belief propagation: $\frac{\overline{T}_{pAC}}{\overline{T}_{VSC}} \approx 3.91$.

Thus, the value of $\gamma$ chosen based on a small set of hard
instances gives good results on a set of instances with different
parameters and of varying hardness.
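For readers who want to reproduce instances of a similar shape, the generation parameters quoted above (variables, values per domain, constraints, nogoods per constraint) can be turned into a simple random binary CSP generator. The sketch below is a simplification and not the generator used in the experiments: in particular it draws distinct constraint scopes, which is an assumption of this example, and the function name and data layout are hypothetical.

```python
import random


def random_binary_csp(n_vars, n_values, n_constraints, n_nogoods, seed=None):
    """Random binary CSP given as domains and, per constraint scope, a nogood set."""
    max_scopes = n_vars * (n_vars - 1) // 2
    if n_constraints > max_scopes or n_nogoods > n_values * n_values:
        raise ValueError("parameters exceed the number of scopes or value pairs")
    rng = random.Random(seed)
    domains = {x: list(range(n_values)) for x in range(n_vars)}
    all_pairs = [(a, b) for a in range(n_values) for b in range(n_values)]
    nogoods = {}
    while len(nogoods) < n_constraints:
        scope = tuple(sorted(rng.sample(range(n_vars), 2)))
        if scope not in nogoods:  # assumption: each scope is used at most once
            # Forbidden value pairs for this scope, drawn without repetition.
            nogoods[scope] = set(rng.sample(all_pairs, n_nogoods))
    return domains, nogoods


# Parameters of the easier and the harder set described above.
easy_domains, easy_nogoods = random_binary_csp(30, 30, 280, 220, seed=0)
hard_domains, hard_nogoods = random_binary_csp(40, 19, 410, 90, seed=0)
print(len(easy_nogoods), len(hard_nogoods))
```

Fixing the seed makes an instance set repeatable, which matters when search time distributions of several heuristics are compared on the same instances.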

5.3 Generalized Sudoku 

Randomly generated problem instances have played a key
role in the design and study of heuristics for CSP. However,
one might argue that the benefits of our scheme are
specific to Model RB. Indeed, real-world problem instances
often have much more structure than random instances
generated according to Model RB. Hence, we repeated the
experiments on randomly generated Generalized Sudoku instances
[Ansotegui et al., 2006], since this domain is highly structured,
and thus a better source of realistic problems with a
controlled measure of hardness.

The search was run on two sets of 100 Generalized Sudoku
instances, with 4x3 tiles and 90 holes and with 7x4 tiles
and 357 holes, with holes punched using the doubly balanced
method [Ansotegui et al., 2006]. The search was repeated
on each instance with the exhaustive solution-counting,
VOI-driven solution counting (with the same value of $\gamma = 10^{-3}$ as
for the RB model problems), minimum conflicts, and
probabilistic arc consistency value ordering heuristics. Results are
summarized in Table 1 and show that relative performance of
the methods on Generalized Sudoku is similar to the
performance on Model RB.
7x4, 357 holes    21.328

Table 1: Generalized Sudoku

5.4 Deployment patterns

One might ask whether trivial methods for selective deployment
would work. We examined deployment patterns of VOI-driven
SC (with $\gamma = 10^{-3}$) on several instances of different
hardness. For all instances, the solution counts were estimated
at varying rates during all stages of the search, and the
deployment patterns differ between the instances, so a simple
deployment scheme seems unlikely to work.

VOI-driven deployment also compares favorably to random
deployment. Table 2 shows performance of VOI-driven
deployment for $\gamma = 10^{-3}$ and of uniform random deployment,
with total number of solution count estimations equal
to that of the VOI-driven deployment. For both schemes, the
values for which solution counts were not estimated were
ordered randomly, and the search was repeated 20 times. The
mean search time for the random deployment is ≈ 1.6 times
longer than for the VOI-driven deployment, and has ≈ 100
times greater standard deviation.





             mean(T), sec   median(T), sec   sd(T), sec
VOI-driven   19.841         19.815           0.188
random       31.421         42.085           20.038

Table 2: VOI-driven vs. random deployment



6 Discussion and related work 



The principles of bounded rationality appear in [Horvitz,
1987]. [Russell and Wefald, 1991] provided a formal
description of rational metareasoning and case studies of
applications in several problem domains. A typical use of rational
metareasoning in search is in finding which node to expand,
or in a CSP context determining a variable or value assignment.
The approach taken in this paper adapts these methods
to deciding whether to spend the time to compute a heuristic.

Runtime selection of heuristics has lately been of interest,
e.g. deploying heuristics for planning [Domshlak et al.,
2010]. The approach taken is usually that of learning which
heuristics to deploy based on features of the search state.
Although our approach can also benefit from learning, since we
have a parameter that needs to be tuned, its value is mostly
algorithm dependent, rather than problem-instance dependent.
This simplifies learning considerably, as opposed to having
to learn a classifier from scratch. Comparing metareasoning
techniques to learning techniques (or possibly a combination
of both, e.g. by learning more precise distribution models) is
an interesting issue for future research.

Although rational metareasoning is applicable to other
types of heuristics, solution-count estimation heuristics are
natural candidates for the type of optimization suggested in
this paper. [Dechter and Pearl, 1987] first suggested solution
count estimates as a value-ordering heuristic (using propagation
on trees) for constraint satisfaction problems, refined in
[Meisels et al., 1997] to multi-path propagation.

[Horsch and Havens, 2000] used a value-ordering heuristic
that estimated relative solution counts to solve constraint
satisfaction problems and demonstrated efficiency of their
algorithm (called pAC, probabilistic Arc Consistency). However,
the computational overhead of the heuristic was large, and
the relative solution counts were computed offline. [Kask et
al., 2004] introduced a CSP algorithm with a solution counting
heuristic based on the Iterative Join-Graph Propagation
(IJGP-SC), and empirically showed performance advances
over MAC in most cases. In several cases IJGP-SC was still
slower than MAC due to the computational overhead.

Impact-based value ordering [Refalo, 2004] is another
heavy informative heuristic. One way to decrease its overhead,
suggested in [Refalo, 2004], is to learn the impact of
an assignment by averaging the impact of earlier assignments
of the same value to the same variable. Rational deployment
of this heuristic by estimating the probability of backtracking
based on the impact may be possible, an issue for future
research. [Gomes et al., 2007] propose a technique that adds
random generalized XOR constraints and counts solutions
with high precision, but at present requires solving CSPs, thus
seems not to be immediately applicable as a search heuristic.

The work presented in this paper differs from the above re- 
lated schemes in that it does not attempt to introduce new 
heuristics or solution-count estimates. Rather, an "off the 
shelf" heuristic is deployed selectively based on value of in- 
formation, thereby significantly reducing the heuristic's "ef- 
fective" computational overhead, with an improvement in 
performance for problems of different size and hardness. 

In summary, this paper suggests a model for adaptive de- 
ployment of value ordering heuristics in algorithms for con- 
straint satisfaction problems. As a case study, the model was 
applied to a value ordering heuristic based on solution count 
estimates, and a steady improvement in the overall algorithm 
performance was achieved compared to always computing 
the estimates, as well as to other simple deployment tactics. 
The experiments showed that for many problem instances the 
optimum performance is achieved when solution counts are 
estimated only in a relatively small number of search states. 